Abstract

Diagnoses of students' performance on procedural mathematical tasks need to display a
certain level of stability and robustness if they are to be used as the basis for remediation,
particularly with computer-delivered instruction. The purpose of this study was to compare
two diagnostic approaches for describing students' errors in algebra, a bug analysis and a
rule-space analysis, with the goal of investigating the relative stability of the diagnoses
derived from these approaches. Consistent with the findings of recent studies, a relatively
large number of bugs were unstable; stable bugs tended to be infrequent. In contrast, the
results of the rule-space analysis yielded relatively more stable diagnoses. The results are
discussed in light of their consequences for designing remediation.

Toward a Stable Diagnostic Representation of Students' Errors in Algebra

Cognitive scientists have proposed and investigated several computational
mechanisms for explaining students' procedural errors in mathematics, including Repair
theory (Brown & Burton, 1978; Brown & VanLehn, 1980; VanLehn, 1990),
misgeneralization (Sleeman, 1984a, 1984b), deletion (Young & O'Shea, 1981), and the
competing-rules model (Payne & Squibb, 1990). Regardless of the adequacy of the
proposed mechanism for accounting for how errors are generated (whether in response to
an impasse or as the result of misgeneralizing a learned rule), a persistent concern about
existing models of errors is their instability (VanLehn, 1982; Sleeman, Kelly, Martinak,
Ward & Moore, 1989; Payne & Squibb, 1989).

In order to investigate the stability of the diagnoses produced by mal-rules,
researchers have observed the recurrence of mal-rules within a test (Payne & Squibb, 1990;
Blando, Kelly, Schneider & Sleeman, 1989; Tatsuoka, Birenbaum & Arnold, 1989) or
across tests (Payne & Squibb, 1990; Sleeman, Kelly, Martinak, Ward & Moore, 1989;
VanLehn, 1982; Bricken, 1987). Both within and across testings, a large number of
mal-rules have been found to be unstable, and the stable ones tend to be very infrequent.
Consequently, doubts have arisen regarding the potential usefulness of mal-rules for
remedial purposes (Sleeman et al., 1989).

The  kernel  of  the  problem  posed  by  unstable  mal-rules  as  cognitive  models  of  error 
was  articulated  by  VanLehn  (1982,  p.  46):  "[Lack  of  stability]  challenges  us  to  change  our 
image  of  a  bug  as  something  that  necessarily  exists  over  time  as  part  of  the  child's  long 
term  beliefs. . ."  In  other  words,  for  the  purposes  of  remediation  we  cannot  be  confident 
that  a  buggy  analysis  of  a  student's  performance  in  a  mathematics  task  necessarily  produces 
a  stable  student  model.  In  order  for  human  or  machine-delivered  remediation  to  proceed  on 
a  reliable  basis,  a  stable  diagnosis  is  a  necessary,  if  not  sufficient,  prerequisite. 

An alternative approach to error diagnosis is to refocus attention on the source of the
impasse that causes buggy behavior (stable or unstable) on the part of the student, rather
than attempting to model the cognitive response to the impasse. For example, a number of
mal-rules have been identified when students are confronted with an equation in the form
ax = b, including x = b (Sleeman et al., 1989), x = b - a (Sleeman et al., 1989; Payne &
Squibb, 1990), x = -(a + b) (Gutvirtz, 1989), x = a - b (Gutvirtz, 1989), and x = a + b
(Gutvirtz, 1989; Payne & Squibb, 1990). What these bugs have in common is that
each is a response to the student's nonmastery of the subskill of dividing across by the
coefficient of x. The cause of the impasse is the nonmastered subskill.

As noted by VanLehn (1982), it is extremely difficult to tease out of a set of items
the presence or absence of subskills using the pattern of right and wrong answers. The
rule-space technique, developed by Tatsuoka, was designed to handle this problem (e.g.,
Tatsuoka, 1983, 1985, 1990, 1991; Tatsuoka & Tatsuoka, 1987). The rule-space technique
classifies students into knowledge states that consist of response patterns that are described
in terms of mastery or nonmastery of predetermined task attributes. The analysis collapses
across items, and classifies students according to factors (subskills in this case) that are
identified to be integral to the successful completion of an item or subsets of items. In this
paper we report on the results of a rule-space analysis of students' performance on linear
equations in one unknown in which the "attributes" were described at the level of the
source of the student's errors (e.g., "has not mastered the distributive law").

More  technically,  rule-space  is  a  probabilistic  approach  whose  purpose  is  to  identify 
the  examinee's  state  of  knowledge,  based  on  an  analysis  of  the  task's  cognitive 
requirements.  The  following  is  a  brief  presentation  of  the  rule-space  approach: 

First, the task's cognitive requirements (also called attributes) are specified. From
these, an item x attribute incidence matrix, Q, is constructed. This matrix is binary and of
order K x m (the number of attributes x the number of items). If q_kj is the (k, j) element of
this matrix (where k indicates an attribute and j indicates an item), then q_kj = 1 if item j
involves attribute k, and q_kj = 0 otherwise. Concepts represented by unobservable variables
that can be derived from the incidence matrix Q are called cognitive states (or attribute
patterns). Boolean Description Functions are used systematically to determine those
cognitive states and map them into observable item-score patterns (called ideal item-score
patterns) (see Tatsuoka, 1991; Varadi & Tatsuoka, 1989). Once the ideal item-score
patterns are obtained, the actual data are considered.
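
As a rough illustration of how ideal item-score patterns can follow from Q, the sketch below enumerates every attribute-mastery pattern for the nine items of set 1 (Q rows as listed in Appendix A) and scores an item as correct only when every attribute it involves is mastered. This conjunctive scoring rule and the brute-force enumeration are simplifying assumptions made for illustration; they are not the Boolean description function algorithm of Tatsuoka (1991), and the names `ideal_pattern` and `distinct_ideal_patterns` are hypothetical.

```python
from itertools import product

# Q rows for the nine items of set 1 (Appendix A); columns are attributes 1..11.
Q_SET1 = {
    1:  "01100000000",
    2:  "01000100000",
    3:  "00000001000",
    6:  "00000010100",
    8:  "01000010000",
    10: "01011010111",
    11: "01001010000",
    12: "01000010100",
    13: "10100000000",
}


def ideal_pattern(mastery, q_rows):
    """Return the ideal 0/1 item scores for one attribute-mastery pattern.

    Assumption: an item is answered correctly only if every attribute
    it involves is mastered (a conjunctive rule).
    """
    scores = []
    for row in q_rows:
        needed = [k for k, bit in enumerate(row) if bit == "1"]
        scores.append(int(all(mastery[k] for k in needed)))
    return tuple(scores)


def distinct_ideal_patterns(q_rows, n_attributes=11):
    """Enumerate all 2**n mastery patterns and keep the distinct ideal patterns."""
    return {ideal_pattern(m, q_rows) for m in product((0, 1), repeat=n_attributes)}


rows = list(Q_SET1.values())
patterns = distinct_ideal_patterns(rows)

# Hypothetical examinee who has mastered every attribute except attribute 8.
all_but_8 = tuple(0 if k == 7 else 1 for k in range(11))
print(ideal_pattern(all_but_8, rows))
```

Under these assumptions, many different mastery patterns collapse onto the same ideal item-score pattern, which is why the number of distinguishable knowledge states is far smaller than the number of possible attribute patterns.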

The rule space then maps the actual item-score patterns of the examinees onto the
cognitive states in order to find the ideal item-score pattern closest to a given student's
actual response pattern. This pattern classification problem is handled by the rule-space
model. Item Response Theory (IRT) is utilized for formulating the classification space,
which is a Cartesian product space of IRT ability/proficiency, θ, and variable(s), ζ, which
measure the unusualness of item-score patterns (Tatsuoka, 1984; Tatsuoka & Linn, 1983).
Bayes' decision rules are used for the classification of an examinee into the cognitive states.
Once this classification has been carried out, one can indicate which attributes a given
examinee is likely to have mastered or failed to master.
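
The geometric idea behind the classification can be sketched as follows: each knowledge state is represented by a centroid in the two-dimensional (θ, ζ) space, and an examinee's point is assigned to the state with the smallest squared Mahalanobis distance. This is a simplification, since it omits the prior probabilities and error-probability calculations of a full Bayes decision rule. The centroids, covariance matrices, and state labels below are invented for illustration and do not come from the study.

```python
def mahalanobis_sq(point, centroid, cov):
    """Squared Mahalanobis distance in two dimensions (theta, zeta)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - centroid[0]
    dy = point[1] - centroid[1]
    # (dx, dy) * inverse(cov) * (dx, dy)^T, using the closed-form 2x2 inverse.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def classify(point, states):
    """Return (label, squared distance) for the nearest knowledge-state centroid."""
    return min(
        ((label, mahalanobis_sq(point, centroid, cov))
         for label, (centroid, cov) in states.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical knowledge states: label -> (centroid, covariance matrix).
states = {
    "state A": ((-1.0, 0.5), ((0.30, 0.00), (0.00, 0.40))),
    "state B": ((0.8, -0.2), ((0.25, 0.05), (0.05, 0.35))),
}

label, d2 = classify((0.8, -0.2), states)
print(label, d2)
```

A point that coincides with a centroid has a squared distance of zero from that state, which is how a perfect match between an observed response pattern and an ideal pattern would show up.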

The  present  study  examined  the  stability  of  the  diagnostic  models  produced  by  rule 
space  and  those  produced  by  a  bug  analysis.  Rule  space  and  buggy  analyses  were  applied 
to  two  sets  of  algebra  items  that  were  designed  to  be  parallel  in  terms  of  their  attributes 
(task  requirements). 

Methodology 

Subjects 

The sample consisted of 231 8th and 9th graders (ages 14-15) from an integrated junior
high school in Tel Aviv. Fifty-seven percent of the subjects were girls. The students studied
mathematics in high and low achievement groupings (106 in the former and 125 in the latter).

Instruments and procedures

A 32-item diagnostic test in linear algebraic equations in one unknown was developed by
Gutvirtz (1989) based on a detailed task analysis including a procedural network and a mapping
sentence (e.g., Birenbaum & Shaw, 1985). The test was developed for the purpose of identifying
students' bugs in solving those equations. All items were open-ended and the students were asked
to show all solution steps. The present study used a subset of those items, which consisted of two
sets of nine items that were parallel attribute-wise: set 1 (items 1, 2, 3, 6, 8, 10, 11, 12, 13) and
set 2 (items 25, 24, 27, 23, 18, 19, 20, 22, 30). (The 18 items appear in Appendix A.)

The correlation coefficient between the scores on the two sets was 0.85. The item
difficulty indices (proportion correct) in set 1 (items 1, 2, 3, 6, 8, 10, 11, 12, 13) ranged from
0.63 to 0.93 with an average of 0.78. In set 2 (items 25, 24, 27, 23, 18, 19, 20, 22, 30) the range
was from 0.53 to 0.91 with an average of 0.76. The item discrimination indices (item-total
correlations) in set 1 ranged from 0.49 to 0.75, with an average of 0.61. In set 2 the range was
from 0.51 to 0.73, with an average of 0.61. The correlation coefficients between the two sets with
respect to item difficulties and item discrimination indices were 0.93 and 0.82, respectively.

The bug analysis:

On the basis of a detailed examination of the procedures followed by the students in
solving the test items, 34 mal-rules (bugs) were identified (see Gutvirtz, 1989, for a listing
of the bugs). A bug x item matrix was then constructed. The entries of this matrix were
the answers to the test items produced by applying the mal-rules. The students' actual
answers were then matched to the entries in the bug matrix and coded accordingly. Of the
actual responses, 94.6% were matched to identified bugs or to the correct rule; the rest
were either unidentified bugs or clerical errors. Of the 231 subjects, 50 answered all 18
items correctly, and were therefore excluded from subsequent analysis. The coded
responses included 38 different codes: one indicating the correct answer, one indicating
unidentified errors, one indicating clerical errors, one indicating omissions, and the rest
indicating the various identified bugs. The codes for parallel items were then compared.
Matches and mismatches were counted across the nine pairs of parallel items for each of the
181 examinees, and classified according to the following primary categories: (a) matched
correct (1,1); (b) one correct and one error (1,0; 0,1); (c) matched bug; and (d) nonmatched
errors (nonmatched bugs or unidentified errors).

The  rule-space  analysis: 

1. Determining the attributes: A set of 11 attributes was specified for a solution strategy
for solving the items (see Table 1) and used to produce an incidence matrix (see Appendix A). For
example, the following attributes are appropriate for item 10 (note that "evaluating" means that the
student decides from the outset not to rewrite the equation in standard form until the final step,
thereby avoiding a negative x-term):

4(2x + 3) = 10x ("evaluating" the equation and applying the distributive law)

8x + 12 = 10x (subtracting a term from both sides)

12 = 10x - 8x (adding or subtracting variable terms)

12 = 2x (dividing across by the coefficient of x, when a<b)

6 = x (applying the symmetry law)

x = 6

See the operations denoted for item 10 in Appendix A, and the attribute list in Table 1.


Insert  Table  1  about  here 


2. Testing the adequacy of the attribute matrix: A multiple regression with item difficulties
as the dependent variable and the 11 attribute vectors of the Q matrix as the independent variables
was performed. The set of attributes accounted for 94% of the variance (R² = .94; adjusted R² = .89).

3. The BILOG program (Mislevy & Bock, 1983) was used for estimating the item
parameters (a's and b's) of the IRT two-parameter logistic model. The b values for the first
subtest correlated 0.90 with the b values for the second subtest. The correlation for the a
values of the two subtests was 0.75. The b values of the first and second subtests ranged
from -2.12 to -.26 and from -1.90 to .04, respectively. The a values of the first and second
subtests ranged from .68 to 1.52 and from .72 to 1.55, respectively.
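
For readers unfamiliar with the two-parameter logistic model, the sketch below evaluates its item response function for two items, using the a and b values listed for items 6 and 27 in Appendix A. The `scale` argument stands for the optional scaling constant often written D; the text does not report which metric was used, so its default of 1.0 is an assumption, and the function name `p_correct` is hypothetical.

```python
import math


def p_correct(theta, a, b, scale=1.0):
    """Two-parameter logistic item response function.

    theta: examinee ability; a: item discrimination; b: item difficulty.
    scale: optional scaling constant (assumed 1.0 here; 1.7 is also common).
    """
    return 1.0 / (1.0 + math.exp(-scale * a * (theta - b)))


# (a, b) pairs as listed in Appendix A.
ITEM_6 = (1.20, -2.12)   # 35 = 7x, the easiest item in set 1
ITEM_27 = (1.13, 0.04)   # 28x = 7, the hardest item in set 2

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(p_correct(theta, *ITEM_6), 3),
          round(p_correct(theta, *ITEM_27), 3))
```

When θ equals an item's b value the modeled probability of a correct response is exactly 0.5, so the negative b values reported above indicate that most items were easy relative to the ability scale.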

4. The BUGLIB program (Varadi & Tatsuoka, 1989) was used for deriving the ideal score
patterns corresponding to the attribute mastery patterns that constituted the groups into which the
students' actual response patterns were classified. As a result, 78 groups (knowledge states) were
generated. The same program was also used for the classification. The classification was applied
to each subset of items separately; that is, each student was classified twice, once according to his
or her responses to set 1, and once according to the responses to the parallel set, set 2.

5. The results of the classifications (i.e., the students' attribute patterns on the two
sets of 11 attributes) were then compared. Of the 231 subjects, 50 answered all 18 items
correctly, and 4 answered all items incorrectly; these 54 subjects were excluded
from subsequent analysis. Matches and mismatches were counted across the 11 pairs of
attributes for each of the 177 examinees and classified according to the following primary
categories: (a) matched mastery (1,1); (b) mastery/nonmastery (1,0; 0,1); and (c) matched
nonmastery (0,0).

Results 

Mal-rule  stability 

Before presenting the results at the group level, two examples of the bug analysis for the
two parallel sets of items for two students are presented in Table 2. A comparison of the two
row-vectors for the first student (No. 13) indicated that he consistently answered correctly one pair
of parallel items and consistently applied incorrect rules on five pairs of items. On the remaining two
pairs of items he inconsistently applied different mal-rules, and on one pair he omitted the response
to one item. Thus the percentage of matched correct responses for this student was 11.11%, the
percentage of matched bugs was 55.56%, and the percentage of nonmatched errors was 33.33%.

The second student (No. 82) also correctly answered one pair of parallel items (11.11%);
she consistently applied the same bug to four pairs (44.44%), and the percentage of nonmatched
errors was 44.44%. In no case did either of the students get one of the items in a pair correct and
the other item incorrect.
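
The pairwise coding just described can be sketched in a few lines. The example below applies the four primary categories to the codes listed for student No. 13 in Table 2; the code labels follow the note to that table ("+" for correct, Ui for unidentified, Cl for clerical error, Om for omitted), and the function names are hypothetical. It is a minimal sketch of the bookkeeping, not the procedure actually used in the study.

```python
from collections import Counter

CORRECT = "+"
NON_BUG = {"Ui", "Cl", "Om"}  # unidentified, clerical error, omitted


def classify_pair(first, second):
    """Classify one pair of response codes from two parallel items."""
    if first == CORRECT and second == CORRECT:
        return "matched correct"
    if first == CORRECT or second == CORRECT:
        return "one correct, one error"
    if first == second and first not in NON_BUG:
        return "matched bug"
    return "nonmatched error"


def summarize(first_set, second_set):
    """Percentage of item pairs falling in each category."""
    counts = Counter(classify_pair(f, s) for f, s in zip(first_set, second_set))
    n = len(first_set)
    return {category: round(100 * k / n, 2) for category, k in counts.items()}


# Codes for student No. 13 across the nine item-pairs (Table 2).
student_13 = summarize(
    ["A", "B", "C", "+", "B", "Ui", "D", "B", "Ui"],
    ["A", "B", "C", "+", "B", "Om", "E", "B", "Cl"],
)
print(student_13)
```

For student No. 13 this reproduces the percentages given in the text: 11.11% matched correct, 55.56% matched bugs, and 33.33% nonmatched errors.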


Insert  Table  2  about  here 


It should be noted that although the two students had the same pattern of correct/incorrect
answers, their bugs differed in type and frequency. While the first student was consistently
applying three mal-rules [A: a + x => ax; B: ax + a => (a + a)x; and C: ax = b => x = a/b (when a
> b)], the second student consistently applied only one mal-rule [F: ax * b = c => ax = c @ b;
when * is "+" then @ is and vice versa].

Evaluated at the group level, 64.58% of the total paired responses across the 9
pairs of items were matched correct answers. A further 18.97% included one correct and
one incorrect response, and 6.38% were nonmatched errors (including nonmatched bugs
and unidentified errors). The remaining 10.07% of the total paired responses were
matched bugs. To better understand this final percentage, note that for the right/wrong
scoring the overall match of correct (1,1) and incorrect (0,0) responses was 81.03%
(64.58% matched correct and 16.45% matched incorrect). Thus, of the incorrect pairs
(0,0), 61% (10.07/16.45) consisted of matched bugs. Greater insight into the percentage of
matched bugs may be gained by inspecting Table 3. This table presents the frequency of stable
bugs for each pair of parallel items. As can be seen, the thirty-four stable bugs are sparsely
distributed across the nine item-pairs.


Insert  Table  3  about  here 

Attribute stability

Before presenting the results at the group level, the following is an example of the
rule-space analysis at the individual level. The example is based on the responses given by
the two students whose bug analyses were presented above. Since both answered correctly
the same pair of items (No. 4 in each subset) and erred on all the other items, their attribute
mastery pattern is identical. The two vectors of 11 attributes for these students, as derived
from their responses to the two parallel subsets, are presented in Table 4. A comparison of
the two row-vectors indicates that they are identical; i.e., they reflect the same knowledge
state. Thus, for both students, the percentage of matched mastery attributes (1,1) is
18.18%, the percentage of matched nonmastery is 81.82%, and that of one mastery and one
nonmastery is 0.00%. The students' response pattern to the test items perfectly matched
the knowledge state indicating mastery of only two attributes (9 and 11, see Table 1), and
nonmastery of all the rest.
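
The attribute-level comparison amounts to counting, across the 11 attributes, how often the two mastery vectors agree. The sketch below does this for the vectors reported in Table 4 for students 13 and 82; the function name `attribute_match` is hypothetical and the code is only meant to make the arithmetic explicit.

```python
def attribute_match(first, second):
    """Percentages of (1,1), (0,0) and mixed pairs across two mastery vectors."""
    if len(first) != len(second):
        raise ValueError("mastery vectors must have the same length")
    n = len(first)
    both = sum(1 for f, s in zip(first, second) if f == 1 and s == 1)
    neither = sum(1 for f, s in zip(first, second) if f == 0 and s == 0)
    return {
        "matched mastery": round(100 * both / n, 2),
        "matched nonmastery": round(100 * neither / n, 2),
        "mastery/nonmastery": round(100 * (n - both - neither) / n, 2),
    }


# Attribute vectors for students 13 and 82 (Table 4), attributes 1..11.
subset_1 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
subset_2 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

result = attribute_match(subset_1, subset_2)
print(result)
```

With two of the 11 attributes mastered in both subsets, the counts give 18.18% matched mastery, 81.82% matched nonmastery, and 0.00% mixed pairs, as stated in the text.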


Insert  Table  4  about  here 


At the group level, the percentages of matched and nonmatched responses across the
11 pairs of attributes were as follows: 80.18% of the responses yielded a match [63.38% of
the responses for mastery (1,1) and 16.80% for nonmastery (0,0)]. The percentage of
nonmatched attributes [mastery/nonmastery, i.e., (1,0) or (0,1) patterns] was 19.82%. The
correlation coefficient between the mastery scores derived from the two subsets in the total
sample, which is an index of the reliability of these scores, was 0.79. Note that at the item
level (0/1 scores) that coefficient was 0.85. The percentage of mastery for each attribute
may be found in Appendix A.

Discussion

The results of the present study showed that a rule-space analysis of attributes
defined in terms of the subskill components of a procedural task produced a relatively
stable within-test student model. On the bug level, although our analysis found more
stable bugs than were previously reported during a single testing session (see data on
School 3 in Payne and Squibb, 1989), many bugs had very low frequencies. While an
unmastered skill is likely to remain unmastered (without intervening tutoring), the impasse
that results from it may trigger many buggy responses (some stable and infrequent, and
many unstable). For the same reason, a measure of mastery/nonmastery of a subskill is
likely to demonstrate stability across testings (and be more stable than a corresponding
buggy analysis), but this prediction needs to be tested empirically.

Advantages  of  Attribute  Analyses  over  Bug  Analyses 

1. A clear advantage of focusing on the deficient subskills (as attributes) is that
they are known mathematical entities. Consequently, remedial prescriptions for teachers
are in terms that are immediately meaningful to them (see Putnam, 1987). Bugs, on the
other hand, are often a mystery both to the researcher and the teacher because "many bugs
have conditions and actions that simply do not appear in any arithmetic algorithm ..."
(VanLehn, 1990, p. 6, original emphasis).

2. The identified attributes are integral subcomponents of the task; thus if a student
fails the task, the failure, at least at the procedural level, must be traceable to one or more
deficiencies in these subskills (if the subskill analysis was exhaustive). The generative
nature of bugs, on the other hand, means that a given catalog of bugs may explain errors
for the data reported in one study, but not in another and, within the same study, bugs
applicable in one school may not be applicable in a different school (Payne & Squibb,
1989). The capriciousness of bugs can lead to inaccurate diagnoses of mathematical errors
(Sleeman et al., 1989; VanLehn, 1990).

3. As a consequence of the above advantages of attributes, remedial scripts for
subskill deficiencies can be prepared beforehand. These scripts may be based on the
recommendations of experienced teachers, culled from published studies, or stem from the
tutors' "best guesses" about successful remedial strategies. A study using rule space as the
basis for remediation has produced positive results (Tatsuoka & Tatsuoka, 1992). Since
bugs may be produced capriciously, prescribing remediation for them is a daunting, if not
impossible, task.

4. Finally, it is very labor intensive for teachers and researchers to identify,
catalog, and diagnose mal-rules [VanLehn (1982) notes that three or four thousand hours
were given to hand analyses of protocols]. And even with this expensive input there is no
guarantee that all of the possible mal-rules will be found (Sleeman et al., 1989; Payne &
Squibb, 1989; VanLehn, 1982). VanLehn (1982, p. 46) noted that even with "excellent
tests, an improved DEBUGGY, and a dedicated staff of experienced diagnosticians," 34%
of the population of students could not be diagnosed in terms of bugs and slips. VanLehn
further noted that the consequence of poor diagnosis for remediation is
that the computer system then has "nothing informative to tell the teacher about the
student" (p. 37, original emphasis).

While we are pleased with the within-test stability results for the rule-space
analysis, future studies should investigate the stability of the rule-space results over time.
In addition, cognitive models for algebra other than the subskill model described here
should also be investigated.

Table 1

Attributes Used in the Q Matrix.

No.   Description

1     Adding a term to both sides of the equation
2     Subtracting a term from both sides of the equation
3     Applying arithmetic order of operations
4     Applying the distributive law
5     Adding or subtracting variable terms
6     Dividing across by the coefficient of x [resulting in x=b/a when a=b]
7     Dividing across by the coefficient of x [resulting in x=b/a when a<b]
8     Dividing across by the coefficient of x [resulting in x=b/a when a>b]
9     Applying symmetry law
10    Evaluating the equation to determine the simplest solution path
11    Applying symmetry law and evaluating the equation to determine the simplest
      solution path

Table 2

Examples of Two Students' Bug Patterns for the Nine Parallel Item-Pairs

                              Item-Pairs
Item sets        1    2    3    4    5    6    7    8    9

Student # 13
  First set      A    B    C    +    B    Ui   D    B    Ui
  Second set     A    B    C    +    B    Om   E    B    Cl

Student # 82
  First set      Cl   F    Ui   +    F    Cl   F    F    F
  Second set     Ui   F    C    +    F    G    F    F    Ui

Note.

+ = Correct response
Mal-rules:
A: a + x => ax
B: ax + a => (a + a)x
C: ax = b => x = a/b (when a > b)
D: ax + b + x => (a + b + 1)x
E: ax + b => (a + b)x
F: ax * b = c => ax = c @ b; when * is "+" then @ is and vice versa.
G: ax * bx = cx => a = cx @ bx; when * is then @ is and vice versa.
Other errors:
Cl: Clerical error
Ui: Unidentified
Om: Omitted

Table 3

Frequency of Stable Bugs by Item-Pairs

                                   Item-Pairs
Bug No.   1&25   2&24   3&27   6&23   8&18   10&19   11&20   12&22   13&30

2 

8 

3 

1 

4 

1 

4 

4 

7 

2 

9 

30 

10 

2 

1 

14 

1 

18 

1 

19 

10 

20 

4 

21 

1 

24 

1 

2 

1 

26 

2 

1 

2 

1  6 

28 

3 

6 

4 

3 

3 

3 

30 

2 

32 

1 

10 

1 

33 

3 

34 

12 

10 

46 

2 

48 

1 

2 

51

1 

52 

1 

59 

1 

1 

1 

1 

1 

1 

1 

63 

1 

75 

1 

98 

1 

102 

1 

2 

104 

1 

106 

1 

116 

1 

1 

117 

1 

1 

1 

1 

121 

1 

130 

1 

1 

131 

1 

1 

Item-pair               1&25   2&24   3&27   6&23   8&18   10&19   11&20   12&22   13&30
No. of different bugs      5     11      9      6      6       5      12       6       5
Frequency                 23     17     54     10     12       8      25      11      20

Table 4

Attribute Mastery Patterns for Students 13 and 82.

                          Attribute
            1  2  3  4  5  6  7  8  9  10  11    Knowledge State    D²

Subset 1    0  0  0  0  0  0  0  0  1   0   1          74           0.0
Subset 2    0  0  0  0  0  0  0  0  1   0   1          74           0.0

Note: The distance, D², is the Mahalanobis distance from the student's point to the
centroid of the closest group on the θ and ζ axes.

Appendix A

The Incidence Matrix, Q, for the 18 Items, the Item Difficulties and Discrimination Indices, and the
Percentage of Mastery for Each Attribute

                                  Attribute                                    IRT
Item  Equation       1   2   3   4   5   6   7   8   9  10  11   % Correct     a       b

 1    3+x=6+3*2      0   1   1   0   0   0   0   0   0   0   0       74       .71   -1.00
25    4+x=6+2*3      0   1   1   0   0   0   0   0   0   0   0       73       .72    -.94
 2    7x+7=14        0   1   0   0   0   1   0   0   0   0   0       81      1.00   -1.18
24    12x+12=24      0   1   0   0   0   1   0   0   0   0   0       81      1.12   -1.08
 3    16x=4          0   0   0   0   0   0   0   1   0   0   0       63      1.28    -.26
27    28x=7          0   0   0   0   0   0   0   1   0   0   0       54      1.13     .04
 6    35=7x          0   0   0   0   0   0   1   0   1   0   0       93      1.20   -2.12
23    24=6x          0   0   0   0   0   0   1   0   1   0   0       92      1.29   -1.90
 8    3+6x=18        0   1   0   0   0   0   1   0   0   0   0       77      1.17    -.85
18    8+4x=26        0   1   0   0   0   0   1   0   0   0   0       85      1.30   -1.25
10    4(2x+3)=10x    0   1   0   1   1   0   1   0   1   1   1       83      1.52   -1.05
19    6(x+3)=12x     0   1   0   1   1   0   1   0   1   1   1       81      1.04   -1.13
11    6+4x+x=22      0   1   0   0   1   0   1   0   0   0   0       77      1.38   -1.00
20    5+3x+x=16      0   1   0   0   1   0   1   0   0   0   0       76      1.35    -.74
12    98=7+7x        0   1   0   0   0   0   1   0   1   0   0       83      1.39   -1.07
22    75=5+5x        0   1   0   0   0   0   1   0   1   0   0       84      1.55   -1.07
13    x-4=4+2*4      1   0   1   0   0   0   0   0   0   0   0       73       .68    -.98
30    x-6=3+5*3      1   0   1   0   0   0   0   0   0   0   0       67       .74    -.61

% Mastered          64  94  64  69  95  59  96  51  95  89  77